
Frontiers in Genetics

Frontiers Media SA

Preprints posted in the last 7 days, ranked by how well they match Frontiers in Genetics's content profile, based on 197 papers previously published here. The average preprint has a 0.32% match score for this journal, so anything above that is already an above-average fit.

1
Investigating the Y chromosome in complex disease: Phenome-wide scan across 104,334 Finnish men

Preussner, A.; Leinonen, J. T.; FinnGen, ; Pirinen, M.; Tukiainen, T.

2026-06-10 genetic and genomic medicine 10.64898/2026.06.09.26355235 medRxiv
Top 2%
3.5%

Although the Y chromosome represents roughly 2% of the male genome, it is often ignored in genome-wide association studies (GWAS). Consequently, the potential health impacts of Y-chromosomal genetic variation remain incompletely understood. To fill this gap, we performed a phenome-wide association study (PheWAS) in FinnGen across 1,426 binary and quantitative traits using Y-chromosomal variation (frequency ≥ 1%) in 104,334 genotyped men. As Y chromosome variation is prone to population stratification, we performed carefully adjusted association analyses and further examined these through kin-based validation in 19,275 female and 24,712 male 1st degree relatives. We found 121 suggestive (p < 5.6×10⁻³) phenotypic associations on the Y chromosome, yet none of these were strong enough to reach phenome-wide significance (p < 3.9×10⁻⁶). While only 38 associations were supported in the kin-based validation, intriguingly we found support for a previously suggested link between haplogroup I1 and coronary heart disease (CHD; OR=1.06, 95% CI=1.02-1.11, p=3.7×10⁻³; male validation OR=1.05; female validation OR=0.97). The I1-CHD association was detected across distinct geographical areas within Finland and was independent of loss of Y (LOY) and the autosomal risk of CHD, suggesting a link between germline Y-chromosomal variation and heart disease risk. Overall, this study presents a comprehensive phenome-wide analysis of Y-chromosomal associations, highlighting the potential relevance of Y-chromosomal variation beyond sex determination. Our findings further emphasize the need for improved capture of Y-chromosomal variants and further analyses in biobank-scale data to allow for deeper exploration of the male-specific genetic architecture of complex diseases.
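
The two significance thresholds quoted in this abstract are numerically consistent with a Bonferroni correction at α = 0.05. The abstract does not say how they were derived, so the sketch below is only an arithmetic check: the count of nine Y-chromosomal tests is an inference from the reported numbers, not a figure given by the authors.

```python
# Arithmetic check of the reported thresholds under a Bonferroni correction.
ALPHA = 0.05
N_TRAITS = 1426   # number of traits, taken from the abstract
N_Y_TESTS = 9     # ASSUMPTION: inferred from the thresholds, not stated in the abstract


def bonferroni(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold for n_tests comparisons."""
    return alpha / n_tests


suggestive = bonferroni(ALPHA, N_Y_TESTS)                 # one family of Y tests
phenome_wide = bonferroni(ALPHA, N_Y_TESTS * N_TRAITS)    # Y tests across all traits

print(f"suggestive threshold:   {suggestive:.1e}")
print(f"phenome-wide threshold: {phenome_wide:.1e}")
```

If the authors used a different procedure (for example, an estimate of the effective number of independent tests), the agreement here would be coincidental.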

2
Contextualizing the Utility of Polygenic Risk Scores using Absolute Risk Models in Diverse Ancestry Populations

Chatterjee, N.; Martina, F.; Kachuri, L.; Natarajan, P.; Witte, J.; Huo, D.

2026-06-04 genetic and genomic medicine 10.64898/2026.06.03.26354842 medRxiv
Top 4%
1.7%

Polygenic risk scores (PRSs) are emerging as powerful tools for quantifying inherited risk for common diseases and, in some cases, are approaching clinical implementation. A major concern for PRS implementation is their limited accuracy in non-European populations, particularly in those of African ancestry. However, past evaluations have focused on metrics such as relative risk or AUC, which do not capture background risk arising from contextual factors. We introduce a novel measure of variable importance, the conditional average derivative estimator (CADE), to evaluate PRS utility across diverse contexts and populations within absolute risk models that integrate PRSs with other relevant risk factors. We illustrate this framework by integrating PRSs for breast and prostate cancer within age-specific absolute risk models for incidence and mortality fit using individual-level data from the All of Us Research Program with inputs from the National Cancer Institute SEER cancer registry. Our projections show that although the PRSs are known to have the lowest discriminatory accuracy in African Americans (AA), there are contexts in which they provide greater utility, such as for the stratification of prostate cancer risk and mortality, where the CADE values for AA were 2- and 7-fold higher than for European Americans. These findings suggest that conclusions about the limited clinical utility of PRS in non-European populations may be premature and underscore the need to quantify PRS risk-stratification utility at the absolute-risk level, while accounting for disease onset, survival, and broader health and economic factors.

3
STELLAR: A flexible ensemble learning framework integrating rare variants to enhance polygenic risk prediction

Chen, T.; Li, X.; Mazumder, R.; Zhang, H.; Lin, X.

2026-06-09 genetic and genomic medicine 10.64898/2026.06.07.26355109 medRxiv
Top 4%
1.7%

Whole-exome and whole-genome sequencing technology has enabled the discovery of rare genetic variants associated with human health and diseases. However, existing statistical methods used for rare variant association testing are not well-suited for building genetic risk prediction models that jointly incorporate rare and common variants. We propose STELLAR, a flexible ensemble learning-based approach to compute rare variant polygenic risk scores (PRS) using association summary statistics to enhance conventional common variant PRS. Our method combines burden-based and penalty-based rare variant analysis and leverages functional annotation information to prioritize potentially causal variants within the prediction models. In simulation studies, PRS using STELLAR consistently showed the highest prediction accuracy compared to models using common variants alone or rare variant burdens. Applied to UK Biobank whole-exome sequencing data (n=310,831) across eight continuous and five binary traits, STELLAR significantly improved prediction accuracy, refined stratification of individuals at the highest genetic risk beyond common variants, and prioritized biologically relevant genes. STELLAR provides a scalable strategy to incorporate rare variants into PRS in addition to common variants, advancing precision risk prediction and enabling more comprehensive assessment of genetic contributions to complex diseases.

4
Prioritizing embryos with lower homozygosity may reduce disease risk in children of related individuals undergoing preimplantation genetic testing

Wolfram, T.; Ahangari, M.; Davidson, I.; Wartschinski, L.; Li, J. H.; Eyre, M.; Stern, D.; Schleede, J.; Haghighi, A.; Carmi, S.; Christensen, M.

2026-06-04 genetic and genomic medicine 10.64898/2026.05.30.26354526 medRxiv
Top 5%
1.7%

Consanguinity is a reproductive union between individuals who share a recent common ancestor. These unions are common in many regions of the world and increase the burden of rare recessive disorders by elevating autozygosity in offspring. Current reproductive genetic screening focuses on a limited set of known pathogenic variants, leaving most recessive risk unaddressed. Here we argue that embryo-level autozygosity, quantified as the fraction of the genome in long runs of homozygosity (FROH), is a potentially actionable genomic biomarker that can be integrated into routine preimplantation genetic testing as a homozygosity-informed embryo-prioritization framework (PGT-H) that can be layered onto existing embryo biopsy workflows when couples are already undergoing IVF with PGT-A or PGT-M. Using forward simulations of first-cousin and double-first-cousin couples, we show that siblings conceived by the same couple span a wide range of FROH; selecting the lowest-FROH candidate from a cohort of five embryos reduces FROH by approximately 40% on average. Combining these reductions with empirical effect-size estimates, we estimate that for first-cousin couples this strategy could reduce risk of intellectual disability by roughly 35-45% (corresponding to an absolute risk reduction of about 1.8-2.2%) and potentially reduce excess recessive disease burden, while also modestly reducing risk of common diseases such as type 2 diabetes. We outline how existing PGT-A and PGT-M workflows could potentially be extended to report embryo-level FROH and discuss ethical and counseling considerations. Autozygosity-based embryo prioritization offers a principled way to address a component of recessive risk that current variant-centric approaches miss.
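
As a rough illustration of the quantity this abstract relies on, FROH can be computed as the summed length of long runs of homozygosity divided by the analyzed genome length, after which the embryo with the lowest value is prioritized. The sketch below is not the authors' pipeline: the 1.5 Mb length cutoff, the genome length, the embryo labels, and the segment coordinates are all hypothetical, and segments are assumed not to overlap.

```python
from typing import Dict, List, Tuple

Segment = Tuple[int, int]  # (start_bp, end_bp) of one run of homozygosity

MIN_ROH_BP = 1_500_000  # ASSUMPTION: hypothetical cutoff for a "long" run


def froh(segments: List[Segment], genome_length_bp: int,
         min_length_bp: int = MIN_ROH_BP) -> float:
    """Fraction of the genome covered by runs at least min_length_bp long."""
    long_bp = sum(end - start for start, end in segments
                  if end - start >= min_length_bp)
    return long_bp / genome_length_bp


def lowest_froh(froh_by_embryo: Dict[str, float]) -> str:
    """Label of the embryo with the smallest FROH."""
    return min(froh_by_embryo, key=froh_by_embryo.get)


# Toy data (hypothetical): three embryos on a 100 Mb "genome".
GENOME_BP = 100_000_000
embryo_segments = {
    "embryo_A": [(0, 6_000_000), (20_000_000, 23_000_000)],
    "embryo_B": [(10_000_000, 12_000_000)],
    "embryo_C": [(0, 4_000_000), (50_000_000, 50_500_000)],  # second run is too short
}
scores = {name: froh(segs, GENOME_BP) for name, segs in embryo_segments.items()}
print(scores, "->", lowest_froh(scores))
```

In practice the runs would come from a dedicated ROH caller applied to embryo genotype data, and the cutoff would need to match whatever definition the effect-size estimates were based on.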

5
Investigation of the continuous spread of SARS-CoV-2 in the post pandemic time - Insights into the reason for the sustained spread despite the establishment of population immunity

Yi, B.

2026-06-08 epidemiology 10.64898/2026.06.05.26355009 medRxiv
Top 5%
1.7%

Despite well-established global population immunity, SARS-CoV-2 continues to spread and cause infection waves. The reasons for this are poorly understood, and it remains difficult to predict the evolution or spreading trend of SARS-CoV-2. It is therefore necessary to investigate whether the establishment of population immunity has changed the evolution or spreading pattern of the virus. In this investigation, an overall analysis of SARS-CoV-2 spread over the past several years was carried out through a genomic epidemiology study, with Germany chosen as a representative location in view of its systematic genomic surveillance efforts. The growth advantage of a few predominant variants during their early spreading periods was evaluated with a logistic regression model. The results show that the major circulating SARS-CoV-2 variants since 2023 derive mainly from the Omicron BA.2 family. Since the middle of 2024, most predominant variants have arisen primarily through recombination, indicating that recombination-driven evolution might be the major driving force behind the continued spread of SARS-CoV-2 despite population immunity. Furthermore, the lower growth advantage of recently emerged variants might lead to a trend toward less frequent infection waves. These findings suggest that although the short-term spreading trend can be affected by specific viral features and the local immunity landscape, the long-term spreading trend is mainly determined by the genomic diversity of the viruses and can be predicted through phylogenetic and genomic epidemiology investigation. The results emphasize the importance of maintaining genomic surveillance of SARS-CoV-2, which is essential from both medical and research perspectives.
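
One common way to estimate a variant's growth advantage with logistic regression is to model the log-odds of a sequenced sample belonging to the variant as a linear function of time; the slope is then the growth rate in log-odds per day. The abstract does not describe the exact model used, so the sketch below is a generic version under that assumption. The function name and the toy counts are hypothetical, and the fit uses plain Newton iterations rather than a statistics library.

```python
import math
from typing import Sequence, Tuple


def fit_logistic_growth(days: Sequence[float],
                        variant_counts: Sequence[float],
                        total_counts: Sequence[float],
                        n_iter: int = 50) -> Tuple[float, float]:
    """Binomial logistic regression of variant share on time.

    Returns (intercept, slope); the slope is in log-odds per day.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for t, k, n in zip(days, variant_counts, total_counts):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * t)))
            resid = k - n * p            # score contribution
            w = n * p * (1.0 - p)        # information weight
            g0 += resid
            g1 += resid * t
            h00 += w
            h01 += w * t
            h11 += w * t * t
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Toy data (hypothetical): 28 days, 1000 sequences per day, expected counts
# generated from a known curve so the fit can be checked against it.
TRUE_B0, TRUE_B1 = -3.0, 0.1
toy_days = list(range(28))
toy_totals = [1000.0] * 28
toy_variant = [n / (1.0 + math.exp(-(TRUE_B0 + TRUE_B1 * t)))
               for t, n in zip(toy_days, toy_totals)]

intercept, slope = fit_logistic_growth(toy_days, toy_variant, toy_totals)
weekly_odds_ratio = math.exp(7 * slope)  # multiplicative change in odds per week
print(f"slope per day: {slope:.4f}, weekly odds ratio: {weekly_odds_ratio:.2f}")
```

Comparing slopes fitted over each variant's early spreading window is one way to express the "lower growth advantage of recently emerged variants" the abstract mentions; real surveillance counts would also call for uncertainty estimates, which this sketch omits.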

6
Prevalence of pfkelch13 Mutations and Clinical Indicators of Artemisinin Partial Resistance in Africa: A Systematic Review and Meta-Analysis of Observational Cohorts

Munyangi wa Nkola, J.; Akilimali Zalagile, P.; Lukuke Mbutshu, H.; Kabala Munyemo, S.; Ramazani Bin Eradi, I.; CAMARA, A.

2026-06-10 genetic and genomic medicine 10.64898/2026.06.04.26354685 medRxiv
Top 7%
1.1%

Background: Artemisinin-based combination therapies remain the mainstay of malaria control strategies; nevertheless, the advent of genetic markers linked to partial artemisinin resistance in Plasmodium falciparum has elicited substantial concern across African settings. To assess the prevalence, geographic distribution, and clinical associations of these molecular markers, we undertook a systematic review and meta-analysis of observational cohort studies. Methods: We conducted a search of cohort studies published between January 2015 and June 2025, following PRISMA 2020 guidelines. We queried databases including PubMed/MEDLINE, Scopus, Web of Science, and CINAHL. Eligibility required prospective enrollment of patients, longitudinal monitoring (therapeutic efficacy studies), and pfkelch13 propeller domain genotyping. Results: A meta-analytical synthesis of 888 isolates from six core prospective cohorts revealed a pooled prevalence of 6% (95% CI: 2.1%-11.8%) for validated pfkelch13 mutations. A profound geographic dichotomy was identified: while West and Central African cohorts maintained a 0% prevalence, East African hotspots showed significant expansion, with prevalence reaching 12.8% in Rwanda and up to 25.5% in Northern Uganda; high statistical heterogeneity (, ) reflects this biological divergence. Conclusions: These findings highlight the established and expanding presence of artemisinin partial resistance in East Africa. Standardized surveillance is essential to adapt malaria control policies across the continent. Keywords: Africa; artemisinin resistance; clinical indicators; pfkelch13 gene; molecular markers; partial resistance; Plasmodium falciparum.

7
Recovery Trends Show Greater Quadriceps Weakness After Patellar Tendon Versus Hamstring Autografts in ACL Reconstruction

Wilebski, B.; Bond, C. W.; Noonan, B. C.

2026-06-10 sports medicine 10.64898/2026.06.08.26355177 medRxiv
Top 8%
0.9%

Context: Although knee extensor and flexor strength deficits are well-documented after anterior cruciate ligament reconstruction (ACLR), limited data exist characterizing how strength recovery evolves over time. Understanding the temporal patterns of recovery, and how they differ by autograft type, is critical for optimizing rehabilitation and return-to-sport decision-making. Objective: To characterize temporal trends in knee extensor and flexor strength recovery during the first year post-ACLR and evaluate differences between patellar tendon and hamstring tendon autografts. Design: Case series. Setting: Sports physical therapy clinics within a large health system. Participants: Five hundred three patients (17.8 ± 3.0 y) who underwent primary reconstruction with either patellar tendon or hamstring tendon autografts and completed a combined 730 return-to-sport tests within 12 months postoperatively. Main Outcome Measures: Normalized peak isokinetic concentric knee extension and flexion torques for involved and uninvolved limbs, and normalized symmetry indices for knee extension and flexion strength. Results: Knee extension strength on both limbs and extension strength symmetry improved over time. Patients with hamstring autografts demonstrated superior involved leg knee extension strength and better extension strength symmetry compared with those receiving patellar tendon autografts, although uninvolved leg strength was similar between autografts. Knee flexion strength on both limbs and flexion strength symmetry also improved over time. Patellar tendon autograft patients exhibited greater flexion strength symmetry, despite no difference between autografts in flexion strength for the involved or uninvolved limb. Conclusions: Autograft type significantly influences muscle strength recovery following anterior cruciate ligament reconstruction. Hamstring tendon autografts are associated with superior recovery of knee extension strength and strength symmetry compared to patellar tendon autografts. 
These findings underscore the need for graft-specific rehabilitation strategies and earlier identification of patients at risk for delayed recovery.

8
Reproductive health in Mexican women with systemic lupus erythematosus: pregnancy outcomes, menstrual irregularities and early menopause

Sevilla-Parra, G.; Bravo-Garcia, F.; Mier y Teran Guevara, M.; Montes-Garcia, A.; Schäfer, A.; Ochoa-Rodriguez, N.; Bienvenu Caballero, M.; Gonzalez Zenteno, S. G.; Pena-Ayala, A.; Tinajero-Nieto, L.; Torres-Valdez, E.; Martinez, D.; Hernandez-Ledesma, A. L.; Medina-Rivera, A.; Alpizar-Rodriguez, D.

2026-06-09 sexual and reproductive health 10.64898/2026.06.07.26354004 medRxiv
Top 9%
0.8%

Objective: To characterize pregnancy outcomes and menstrual irregularities in Mexican women with systemic lupus erythematosus (SLE) and identify clinical factors associated with adverse pregnancy outcomes and early-onset menopause. Methods: We conducted a cross-sectional study of women with SLE enrolled in the Mexican Lupus Registry (LupusRGMX) between May 2021 and September 2024. Clinical and reproductive data were collected using standardized questionnaires. Menopause was defined as the absence of menstruation for ≥12 consecutive months, and early menopause as onset before age 40. Univariable and multivariable logistic regression analyses were used to identify factors associated with pregnancy complications and early menopause. Results: A total of 210 women were included. Median age was 38 years (IQR 29-46) and median disease duration was 4 years (IQR 1-10). Among women with a history of pregnancy (47%), full-term delivery predominated (61%), while pregnancy loss occurred in 26% and preterm delivery in 13%. Pregnancy complications were reported in 9.6%, most commonly preeclampsia (6.7%). Younger maternal age was independently associated with pregnancy complications (OR 0.89, 95% CI 0.83-0.95) and adverse outcomes (OR 0.95, 95% CI 0.92-0.98). Higher disease activity was associated with complications in univariable analysis. Most pregnancies (68.3%) occurred before diagnosis. Early menopause was observed in 6.2% and independently associated with longer disease duration and older age. Conclusion: Younger maternal age was independently associated with adverse pregnancy outcomes, whereas disease activity showed an association in univariable analysis. Most pregnancies occurred prior to SLE diagnosis. Early menopause was associated with longer disease duration, suggesting an impact of cumulative disease burden on ovarian function.

9
Breast cancer polygenic risk score performance varies by socioeconomic status

Domian, H. I.; Tian, X.; Ong, D.; Hamilton, L.; Shieh, Y.; Musharoff, S. A.

2026-06-04 genetic and genomic medicine 10.64898/2026.06.03.26354819 medRxiv
Top 10%
0.7%

Background: Polygenic risk scores (PRS) for breast cancer are increasingly used for risk stratification to inform screening and prevention. However, for PRSs to be equitable and clinically useful, they need to perform well across diverse populations. While PRS performance is known to be ancestry-dependent, it is not well understood how environmental context, such as that of socioeconomic status (SES), affects PRS transferability. Here, we assess whether SES, measured via self-reported household income, modifies breast cancer PRS performance and, if so, whether socioeconomic context contributes predictive information beyond genetic risk alone. Methods: We used the US-based All of Us biobank to evaluate how SES impacts breast cancer PRS performance. First, we quantified changes in breast cancer PRS performance by modeling a commonly cited polygenic score for breast cancer previously described by Mavaddat et al. with SES. We then reestimated the genetic effect sizes of the 3,820 variants from Mavaddat et al. in All of Us with and without income as a covariate. Because social determinants of health affect breast cancer detection and outcomes, we stratified analyses by socially defined populations on the basis of self-identified race and ethnicity. We further stratified individuals whose self-identified race is White ("White") into three SES groups (high, middle, low) based on self-reported income and re-estimated genetic effect sizes to create SES-specific PRSs. We then applied these PRSs to White participants, the largest group in the study, and to Black or African American ("Black") and Hispanic or Latino ("Hispanic") participants, groups underrepresented in breast cancer research. Model discrimination between cases and controls was measured by area under the curve (AUC). 
Results: We analyzed 163,715 women from the All of Us biobank, which included 8,833 breast cancer cases (6,619 White, 1,178 Black, and 1,036 Hispanic), with relative income available for a subset of these cases (5,525 White, 848 Black, and 566 Hispanic). The ancestry-dependent performance of the breast cancer PRS described in Mavaddat et al. was replicated in All of Us. In Black individuals, this PRS (AUC and 95% CI: 0.576 [0.571, 0.582]) produced a similar increase in AUC as relative income (AUC: 0.573 [0.568, 0.577]) when added to an age-only model. Incorporating income with PRS, age, and genetic PCs 1-3 improved AUC by 0.007 in White Americans and 0.018 in Black Americans (both p < 10⁻¹¹), while attenuating the contribution of PRS in the full model. PRS performance also varied among SES categories. Notably, PRSs with variant effect sizes that were recalibrated in low-SES White participants performed best in low-SES White participants (AUC: 0.605 [0.583, 0.628]) and Black Americans (AUC: 0.588 [0.586, 0.591]), both better than performance in high-SES White Americans (AUC: 0.579 [0.577, 0.580]) and middle-SES White Americans (AUC: 0.578 [0.569, 0.586]). Conclusion: Socioeconomic context, measured by income, significantly impacts the transferability of a PRS for breast cancer within and among groups defined by self-identified race and ethnicity. Accounting for SES improves PRS performance, most notably in Black Americans and low-SES White individuals.

10
Shared epigenetic regulation acting on neuroimmune pathways contributes to the comorbidity between generalized anxiety disorder and COVID-19

Karaca, S.; Cabrera Mendoza, B.; He, J.; Qiu, D.; Davtian, D.; Lacobelle, A.; Nunez, Y. Z.; Krystal, J. H.; Pietrzak, R. H.; Gelernter, J.; Polimanti, R.

2026-06-04 genetic and genomic medicine 10.64898/2026.06.03.26354830 medRxiv
Top 10%
0.7%

Background: The biological mechanisms linking generalized anxiety disorder (GAD) and COVID-19 remain poorly understood, despite substantial evidence of their comorbidity. To address this gap, we examined genetic and epigenetic factors underlying their co-occurrence. Methods: In a multi-ancestry sample of 893 participants, we conducted genome-wide and epigenome-wide analyses of GAD and COVID-19 severity. Integrating large-scale genome-wide datasets and information regarding methylation quantitative trait loci, complementary analytic approaches were used to identify regional methylation patterns, assess genetically regulated DNA methylation in blood and brain tissue, and evaluate causal loci shared between GAD and COVID-19. Results: GAD was associated with epigenome-wide significant variation in loci involved in chromatin regulation and synaptic signaling. Conversely, COVID-19-related epigenetic signals were enriched in immune-inflammatory and host-response pathways. Mild COVID-19 was epigenetically related to endothelial-inflammatory signals, while severe COVID-19 was linked to epigenetic changes implicated in myeloid and thrombo-inflammatory pathways. Epigenetic signals shared between GAD and COVID-19 implicated processes related to stress adaptation and tissue homeostasis. Genetically informed analyses identified 60 shared loci, including MAPT, ZFP57, and FBXL18, indicating pleiotropy between GAD and COVID-19 in genetically regulated DNA methylation variation. Brain-specific analyses further highlighted convergence in additional loci (i.e., MICB and HLA-DPB1), suggesting neuroimmune mechanisms underlying GAD-COVID-19 shared methylation patterns. Conclusions: These findings support that GAD and COVID-19 share epigenetic and genetic architecture involving pathways related to vascular integrity, immune function, and cellular adaptation, highlighting a potential neuroimmune basis for their co-occurrence.

11
Watching the FIFA World Cup and Adult Sleep Quality: A Cross-Sectional Online Survey

Aljamaan, F.; Alanteet, A. A.; Chaiah, Y.; Dasuqi, S. A.; Alarabi, M. A.; Saeed, E.; Al-khatib, S. M.; Darweesh, A. A.; Raina, M.; Saad, K.; Alhasan, K.; BaHammam, A. S.; Temsah, M.-H.

2026-06-08 sports medicine 10.64898/2026.06.07.26355072 medRxiv
Top 11%
0.7%

Major international sporting events frequently impose exogenous demands that challenge adult circadian rhythms, often leading to the misalignment of sleep-wake cycles and social schedules. This cross-sectional study investigated the impact of the FIFA 2022 World Cup on adult sleep patterns to assess the prevalence and determinants of tournament-associated circadian disruption. Through an online survey, we captured data on sleep duration, timing, and subjective quality from a diverse adult population using the Pittsburgh Sleep Quality Index (PSQI) score. The results indicate that 81.3% had highly problematic sleep according to PSQI scores, while only 9% perceived that their sleep pattern was impacted by watching matches during the tournament. While 83.7% of the participants had low or mild anxiety according to GAD-7 scores, we found that GAD-7 scores correlated significantly with PSQI scores. Married participants had significantly lower PSQI scores (RR 0.856, p = .005), while those who reported that their sleep hours had changed during the tournament had significantly higher PSQI scores (1.180, p < .001). Males reported a significantly higher impact of the tournament on their sleep (OR 2.622, p < .001). In conclusion, our data demonstrate a discrepancy between self-perceived sleep quality and sleep quality as assessed by PSQI scores, as well as a substantial impact of major international sporting events on adult sleep hygiene. The results provide data-driven insights helpful in evaluating potential circadian risks and informing public health strategies for major sporting events such as the FIFA World Cup.

12
Context-Dependent Age-Group performance hierarchies limit fairness interventions in PPG-based heart rate prediction

Panchumarthi, L. Y.; Kataria, S.; Wu, Y.; Hu, X.; Fedorov, A.; Kwak, H. G.

2026-06-05 health informatics 10.64898/2026.06.04.26352929 medRxiv
Top 14%
0.4%

Background. Fairness-aware machine learning increasingly targets demographic performance disparities in clinical prediction, yet whether standard bias mitigation strategies genuinely improve equity in physiological signal analysis remains unclear. Age-based disparities in photoplethysmography (PPG)-based heart rate prediction present a particular challenge, as age-related performance differences may reflect context-dependent physiological structure rather than correctable artifacts. Methods. We evaluated three fairness interventions, inverse-frequency weighting (IF), Group Distributionally Robust Optimization (GroupDRO), and adversarial debiasing (ADV), applied via fine-tuning of a PPG foundation model across three clinical datasets spanning intensive care unit, laboratory, and consumer wearable contexts. Outcomes were assessed using a 2x2 framework classifying each intervention-dataset combination by the joint direction of change in mean absolute error (MAE) and fairness gap (FG) across age groups, yielding four outcome types: genuine improvement (G), leveling down (L), selective benefit (S), and both worse (W). Results. Across nine intra-domain conditions, no intervention simultaneously improved both MAE and FG (0/9 genuine improvement). The dominant pattern was leveling down (5/9): FG decreased but was accompanied by MAE degradation, indicating that apparent fairness gains were achieved at the cost of overall predictive performance. Age-group difficulty ordering varied across clinical contexts at baseline and was not preserved under intervention. In 18 cross-domain transfer conditions, genuine improvement was rare (4/18) and observed exclusively in non-MIMIC source configurations; models fine-tuned on MIMIC-sourced data yielded no genuine improvements (0/6). Embedding-level representation changes following fine-tuning did not reliably predict fairness outcomes. Conclusions. 
Age-based fairness interventions in PPG heart rate prediction indicate a leveling-down pattern rather than genuine equity improvement, suggesting that age-related performance gaps reflect context-dependent physiological structure not fully addressable through standard bias mitigation. Cross-domain transfer further amplifies this instability. These findings suggest that fairness evaluation frameworks for age-stratified physiological prediction should account for context-dependent performance structure rather than treating observed gaps as correctable bias.
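
The 2x2 outcome framework described in this abstract can be written as a small classification rule on the change in mean absolute error (MAE) and fairness gap (FG) relative to baseline. The abstract defines leveling down explicitly (FG decreases while MAE degrades); the sketch assumes that "selective benefit" is the mirror case (MAE improves while FG does not) and that a change of exactly zero counts as no improvement. The function name is hypothetical.

```python
def classify_outcome(delta_mae: float, delta_fg: float) -> str:
    """Classify one intervention-dataset pair.

    delta_mae: MAE after intervention minus baseline MAE (negative = better).
    delta_fg:  fairness gap after intervention minus baseline gap (negative = better).
    Returns "G" (genuine improvement), "L" (leveling down),
    "S" (selective benefit), or "W" (both worse).
    """
    mae_better = delta_mae < 0
    fg_better = delta_fg < 0
    if mae_better and fg_better:
        return "G"
    if fg_better:
        return "L"   # gap narrows, but overall error gets worse
    if mae_better:
        return "S"   # ASSUMPTION: error improves while the gap does not narrow
    return "W"


# Toy deltas (hypothetical), one per outcome type.
examples = [(-0.4, -0.2), (0.3, -0.5), (-0.1, 0.6), (0.2, 0.1)]
print([classify_outcome(dm, dfg) for dm, dfg in examples])
```

Counting the labels over all intervention-dataset pairs gives tallies of the kind the abstract reports (for example, 0/9 genuine improvement and 5/9 leveling down in the intra-domain conditions).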

13
Human genetic evidence links serine biosynthesis to diabetic peripheral neuropathy

Fridman, V.; Kakar, A.; Jensen, A.; Van de Vondel, L.; Wheeler, A.; Phillips, L. S.; Zhou, J.; Zuchner, S.; Reusch, J.; Raghavan, S.

2026-06-10 genetic and genomic medicine 10.64898/2026.06.09.26355286 medRxiv
Top 14%
0.4%

Diabetic peripheral neuropathy (DPN) is a common and disabling condition for which no disease-modifying therapies are available. Glycemic and metabolic drivers do not fully explain why only a subset of individuals with diabetes develop DPN, and genetic contributors remain poorly defined. We aimed to perform a multi-population genome-wide association study (GWAS) of DPN to highlight potential new etiological pathways and therapeutic targets. Methods: We performed a multi-population GWAS of neuropathy in people with and without diabetes using the VA Million Veteran Program and UK Biobank, followed by replication in the All of Us Research Program (AoU), and gene-based and gene-set analyses to identify implicated pathways. Causal relationships between circulating serine levels and DPN were further tested using two-sample Mendelian randomization. To further evaluate pathogenic potential, we analyzed rare, high-impact variants in GWAS-implicated genes among individuals with unresolved inherited neuropathies using the GENESIS platform. Findings: Among individuals with type 2 diabetes, we identified seven genome-wide significant loci (p<5x10-): PHGDH and PSPH (key serine synthesis genes), TEAD1, CYP4F11, LARGE1, FTO, and COBLL1. No loci were significant in individuals without diabetes or with type 1 diabetes. Four loci (PHGDH, TEAD1, FTO and CYP4F11) replicated in AoU (p <0.05). Mendelian randomization demonstrated that higher genetically predicted serine levels were associated with lower DPN risk, consistent with a causal role of serine metabolism in disease pathogenesis. Rare-variant burden analyses revealed associations of predicted deleterious variants with inherited neuropathy case status in PHGDH (odds ratio [OR] 12.7 [95% CI 7.9, 20.4]), PSPH (OR 8.5 [7.2, 10.2]), PHKG1 (OR 4.8 [3.7, 6.3]), and LARGE1 (OR 0.007 [0.0004, 0.1]). Interpretation: Convergent genetic evidence across common and rare variation implicates serine synthesis as a key pathway in DPN. 
These findings link diabetic and inherited neuropathies through a shared metabolic mechanism, identifying serine metabolism as a potential therapeutic target.

14
Whole-exome-based preconception carrier screening in Uzbekistan with targeted SMA, FMR1, and DMD assays: the first reported clinical program

Kullyev, A.; Avdeichik, S.; Akimenkova, A.; Kartuesov, A.; Kardymon, O.; Goikhman, Y.

2026-06-04 genetic and genomic medicine 10.64898/2026.06.02.26354713 medRxiv
Top 15%
0.3%

Purpose: Published clinical outcome data on preconception carrier screening (PCS) in Central Asia are limited. We report the first clinical implementation study from Uzbekistan of a whole-exome sequencing (WES)-based multi-platform PCS program combining exome sequencing with targeted SMA, FMR1, and DMD assays. Methods: We retrospectively analyzed anonymized data from 65 individuals (19 couples, 27 singletons) screened at IMC Genomics, Tashkent, between January 2024 and May 2026. WES covering the protein-coding regions of approximately 20,000 genes was followed by exome-wide bioinformatics filtering and clinical geneticist interpretation. Partly overlapping cohorts underwent SMA carrier screening (n=179), FMR1 CGG-repeat analysis in females (n=155), and DMD deletion/duplication testing in preconception females (n=29). Variants were classified by ACMG/AMP criteria against gnomAD v4.1. Results: Sixty-one of 65 WES-screened individuals (93.8%; 95% CI 85.2-97.6%) carried at least one reportable variant (152 instances across 126 genes). Four of 19 couples (21.1%; 95% CI 8.5-43.3%) were concordant for pathogenic or likely pathogenic variants in the same autosomal recessive gene; two were referred for preimplantation genetic testing for monogenic disease. SMA screening identified four carriers, including two 2+0 silent carriers; FMR1 analysis identified one intermediate allele; DMD MLPA identified no exonic rearrangements. Conclusion: This first reported WES-based multi-platform PCS program in Uzbekistan was feasible and clinically informative, identifying actionable couple-level reproductive risks and supporting structured implementation of reproductive genetic screening in Central Asia.
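
With samples this small, the choice of confidence-interval method for a proportion matters. The abstract does not name the method it used; the sketch below applies the Wilson score interval, which happens to reproduce both reported intervals (61/65 and 4/19) to one decimal place. That agreement is an observation about the arithmetic, not a statement of what the authors did.

```python
import math
from typing import Tuple

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wilson_interval(successes: int, n: int, z: float = Z_95) -> Tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1.0 - p_hat) / n + z * z / (4 * n * n)
    )
    return center - half_width, center + half_width


# Counts taken from the abstract.
carriers_lo, carriers_hi = wilson_interval(61, 65)   # individuals with a reportable variant
couples_lo, couples_hi = wilson_interval(4, 19)      # concordant couples

print(f"61/65: {carriers_lo:.1%} to {carriers_hi:.1%}")
print(f"4/19:  {couples_lo:.1%} to {couples_hi:.1%}")
```

Unlike the simple normal-approximation interval, the Wilson interval stays within 0 and 1 and is asymmetric around the point estimate, which is why the upper bound for 61/65 sits closer to 93.8% than the lower bound does.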

15
Prevalence and factors associated with peripheral artery disease among patients with diabetes mellitus: A cross-sectional study at tertiary hospital in Eastern Uganda

Imalingat, J.; Muyinda, A.; Iraguha, D.; Katuramu, R.; Masaba, P.; Apio, E.; Kebesu, J.; Nankunda, O.; Kirabo, E.; Epuitai, J.; Bwayo, D.

2026-06-05 cardiovascular medicine 10.64898/2026.06.03.26354843 medRxiv
Top 16%
0.3%

Background: Peripheral artery disease (PAD) is a major contributor to morbidity and mortality, particularly among individuals with diabetes mellitus (DM), in whom its prevalence is markedly increased. PAD is often asymptomatic and under-diagnosed, especially in low-resource settings. This study aimed to determine the prevalence of PAD and associated factors among adults with DM in Eastern Uganda. Methods: We conducted a hospital-based cross-sectional study at Mbale Regional Referral Hospital from 10 December 2024 to 30 April 2025. A total of 300 adult patients with DM were consecutively enrolled. Data on sociodemographic characteristics, clinical characteristics, comorbidities, and behavioural risk factors were collected using an interviewer-administered data tool. PAD was assessed using the ankle-brachial index (ABI) and defined as ABI ≤ 0.90. Modified Poisson regression was used to identify factors associated with PAD. As a secondary measure of PAD, we administered the Edinburgh Claudication Questionnaire (ECQ) to capture symptomatic PAD. Results: Most participants had low fruit intake (68%), physical inactivity (54%), and elevated low-density lipoprotein (60%). The prevalence of PAD as measured by ABI was 42.3% (127/300; 95% CI 0.38-0.48), while the prevalence of PAD as measured by ECQ, combining participants with possible and definite claudication, was 37.3% (95% CI 31.9 - 42.8). Of the participants with PAD, 15.8% (20/127) were classified as having severe PAD (ABI <0.4). Socio-demographic and clinical factors were assessed for association with PAD. We found no evidence of association between PAD and the examined factors, including age (aPR 1.24, 95% CI 0.73 - 2.09), sex (aPR 1.46, 95% CI 0.84 - 2.55), cholesterol level (aPR 1.39, 95% CI 0.86 - 2.25), glycemic control (aPR 1.35, 95% CI 0.72 - 2.53), and sedentary behaviour (aPR 1.28, 95% CI 0.79-2.08). Conclusion: The prevalence of PAD was high among adults with DM in Eastern Uganda.
Routine health education and ABI screening for PAD should be provided for patients living with DM. The absence of significant associations despite the high prevalence of PAD may reflect unmeasured factors, such as chronic inflammation, that may be unique to this population. Future prospective studies with larger sample sizes and more detailed objective measures, such as inflammatory markers, are needed to determine locally relevant modifiable risk factors.

16
Understanding Human AI Discrepancy in Breast Cancer TIL Assessment: A Multi-Rater and Perceptual Bias Study

Capar, A.; Aloglu, I.; Aker, F.; Ertano, M.; Mese, Y. E.; Ungor, A.; Yildiz, B. E.

2026-06-04 pathology 10.64898/2026.05.29.26354196 medRxiv
Top 16%
0.3%

Objective: Tumor-infiltrating lymphocytes (TILs) in breast cancer are one of the most important indicators of the immune response within the tumor microenvironment. They play a particularly significant prognostic and predictive role in triple-negative and HER2-positive subtypes. However, substantial inter-observer variability has been reported in TIL scoring among pathologists, which limits its reliability in clinical practice. The aim of this study was to evaluate the agreement between artificial intelligence (AI) models and pathologists in TIL scoring and to compare this agreement using different statistical approaches, thereby assessing the potential of AI integration into pathology practice. Materials and Methods: Digitized histopathological images of breast cancer cases were included in the study. Tumor regions annotated by pathologists were evaluated for both stromal TIL percentage and the proportion of stromal tumor area within each ROI, with assessments performed independently by three pathologists and two AI models. Agreement was assessed among pathologists, between pathologists and AI, and between AI models. Statistical analyses included intraclass correlation coefficient (ICC), Cohen and Fleiss kappa, correlation tests, and Bland-Altman analysis. In addition, categorical agreement was examined using different cut-off values. Results: Inter-pathologist agreement was high, with an ICC of 0.81. In contrast, the global agreement between pathologists and AI models was lower (ICC 0.41). Pairwise comparisons of pathologist-AI agreement yielded substantially lower ICC values (0.12-0.21), although this improved to 0.53 when three pathologists were assessed jointly with a single AI model. The strongest categorical agreement was observed with dichotomized TIL scores (≤10% vs. >10%), whereas multi-category classifications were associated with a marked reduction in kappa values.
Spearman correlation coefficients between pathologists and AI models ranged from moderate to good (ρ = 0.48-0.81). Agreement between the two AI models themselves was moderate, with an ICC of 0.64.

17
Rare neurological and neurodevelopmental variants in ALS link to onset, survival and family history

O'Donoghue, C.; Kacar, E.; Gomes, T.; Costello, E.; Pender, N.; Peelo, C.; Ryan, M.; Heverin, M.; Byrne, S.; Bede, P.; Hardiman, O.; McLaughlin, R. L.; Byrne, R. P.

2026-06-10 genetic and genomic medicine 10.64898/2026.06.09.26354977 medRxiv
Top 16%
0.3%

Background: Neurological, neuropsychiatric, and neurodevelopmental disorders cluster in ALS families, sharing a common genetic architecture with ALS. Pathogenic variants in genes associated with other neurological, neurodevelopmental, or neuropsychiatric disorders may also co-occur in ALS and modify phenotype. We have sought to determine the prevalence and clinical pattern of likely-pathogenic/pathogenic (LP/P) non-ALS neurological, neurodevelopmental, and neuropsychiatric variants, alone and in combination with ALS-gene variants, in two large ALS cohorts. Methods: Whole-genome sequencing (WGS) of 469 Irish and 774 Answer ALS people with ALS (pwALS) was analysed for ClinVar LP/P variants associated with other neurological (n = 15541), neurodevelopmental (n = 9761), and neuropsychiatric (n = 321) phenotypes. Inheritance patterns for associated genes (autosomal recessive/autosomal dominant) along with the associated phenotype were validated using OMIM. Standardised clinical data included family history, site and age of onset, El Escorial category, survival, motor decline, and cognitive and behavioural assessments. Known ALS-gene variants and C9orf72 repeat expansion status were included for each cohort. Results: Non-ALS neurological variants were identified in 47/469 (10.0%) Irish and 69/774 (8.9%) Answer ALS participants, most frequently in hereditary spastic paraplegia-associated genes (3.2% Irish; 2.8% Answer ALS). Irish neurological variant carriers showed higher frequency of respiratory onset (10.6% vs 1.2%, Fisher's exact p = 0.002, Φ = 0.20) and fewer premorbid behavioural symptoms (0.92 ± 0.56 vs 3.08 ± 0.97, Cohen's d = -0.40). Neurodevelopmental variants occurred in 12/469 (2.6%) Irish and 20/774 (2.6%) Answer ALS participants.
In the Irish cohort, neurodevelopmental variant carriers had significantly shorter survival in a Cox proportional hazards model (log-rank p = 0.005), corresponding to a more than two-fold increased hazard of death (HR = 2.25, 95% CI 1.26-4.00), and had significantly increased familial burden of neuropsychiatric disorders among first- and second-degree relatives (negative binomial IRR for carriers = 2.41, 95% CI: 1.12-5.18, p = 0.025). Across combined cohorts, 18 individuals (Irish n = 8; Answer ALS n = 10) carried ≥2 LP/P variants spanning ALS and non-ALS genes. Conclusion: Rare LP/P variants in genes associated with other neurological and neurodevelopmental disorders occur in up to 12% of pwALS across two independent cohorts. Carriers show distinct phenotypes, shorter survival, and characteristic family history patterns. These findings suggest that extended pleiotropic and oligogenic architectures may contribute to ALS heterogeneity.

18
More Than Results: A Qualitative Study on the Role of Person-Centered Genetic Counseling in Parkinson Disease Research

Verbrugge, J.; Fiallos, K.; Cook, L.; Miller, M.; Head, K. J.

2026-06-09 genetic and genomic medicine 10.64898/2026.06.03.26354465 medRxiv
Top 16%
0.3%

As genetic testing becomes increasingly integrated into Parkinson disease (PD) research, including targeted testing for variants in LRRK2 and GBA1, the return of individual research results is becoming more common. However, limited qualitative data exists regarding how research participants experience genetic results disclosure and post-test genetic counseling in PD research settings. We conducted semi-structured qualitative interviews with participants (n=13) enrolled in the Parkinson Precision Medicine Initiative (formerly Parkinson Progression Markers Initiative; PPMI) who had received PD-related genetic test results and post-test genetic counseling. Interviews were conducted 1 to 3 weeks following result disclosure and analyzed using thematic analysis with a primarily deductive coding approach informed by study aims and inductive identification of emergent themes. Four primary themes were identified: (1) personal connection and motivations for participation, (2) centrality of result disclosure and information preferences, (3) emotional experiences and support needs, and (4) communication quality and alignment with participant needs. Overall, our findings underscore the importance of person-centered genetic counseling within PD research. As the return of genetic and biomarker results in research and clinical trial contexts expands, thoughtful integration of relational, informational, and communication-focused practices will be essential to support participant engagement and trust.

19
KESOZI Digital Twin: Physics-Informed Neural Network for Independent Estimation and Prediction of Childhood Diarrheal Disease Burden in Kenya, Somaliland, and Zimbabwe

KESOZI Digital Twin, ; Agumba, J. O.; Namusonge, L.; Ogendo, J.; Hassan, M. A.; Pembere, A.; Takavarasha, M.

2026-06-04 epidemiology 10.64898/2026.06.03.26354823 medRxiv
Top 17%
0.3%

Childhood diarrheal disease remains a leading cause of morbidity and mortality among children under five years in sub-Saharan Africa, particularly in settings affected by inadequate sanitation, climate variability, malnutrition, and limited healthcare access. Conventional forecasting approaches are often constrained by sparse surveillance data, weak spatial representation, and limited incorporation of mechanistic disease dynamics. This study presents a Physics-Informed Multimodal Artificial Intelligence Digital Twin framework that integrates Physics-Informed Neural Networks, Graph Neural Networks, diffusion-reaction epidemiological modeling, multimodal fusion learning, and Digital Twin simulation to estimate and predict childhood diarrheal disease burden in Kenya, Somaliland, and Zimbabwe. Using public epidemiological, environmental, climate, sanitation, and synthetic proof-of-concept datasets, the framework modeled temporal disease dynamics, spatial transmission, pathogen-attributed burden, and outbreak trajectories while enforcing epidemiological consistency through physics-informed optimization. Results demonstrated robust forecasting performance, enhanced spatial transmission modeling, uncertainty-aware predictions, and realistic outbreak simulations across the three countries. Rotavirus, Shigella, and Cryptosporidium were identified as major contributors to modeled mortality burden, while unsafe water exposure, poor sanitation, malnutrition, and climate-sensitive transmission substantially increased disease risk. Compared with a Bayesian baseline model, the multimodal framework achieved superior nonlinear risk characterization, geospatial learning, and temporal prediction. These findings highlight the potential of scientific machine learning and digital twin systems for infectious disease surveillance, outbreak forecasting, climate-health analytics, and evidence-based public health decision-making in low-resource African settings. 
Keywords: Physics-Informed Neural Networks, Graph Neural Networks, Digital Twin, Childhood Diarrheal Disease, Epidemiology, Kenya, Somaliland, Zimbabwe, Scientific Machine Learning, Spatial Epidemiology, Multimodal Fusion

20
Burden of Chronic Kidney Disease in China, 1990-2021: Findings from the 2021 Global Burden of Disease Study

Wang, M.; Zhao, T.; Wang, H.; Hou, S.; Fu, Y.

2026-06-09 epidemiology 10.64898/2026.06.06.26355056 medRxiv
Top 17%
0.3%

Introduction: We investigated the epidemiological characteristics of chronic kidney disease (CKD) in China in 2021 and its trends between 1990 and 2021, in the context of significant population growth and lifestyle changes over the past 30 years that have likely influenced the CKD spectrum. Methods: Data on CKD prevalence, mortality, disability-adjusted life-years (DALY), and risk factors were obtained from the Global Burden of Disease Study 2021. The estimated decadal percentage changes were calculated to evaluate changes in trends in prevalence, mortality and disease burden. Results: In 2021, an estimated 118.4 (95% UI 109.4 to 127.5) million people in China were affected by CKD, contributing to 204 230 (95% UI 164 736 to 246 372) deaths and 6.13 (95% UI 5.18 to 7.21) million DALY. Although CKD due to diabetes mellitus and hypertension accounted for less than a quarter of all cases, they were responsible for over 90% of CKD-related deaths. Over the past three decades, CKD mortality and DALY rates have steadily increased, although the prevalence has stabilized in the last decade. Diabetes mellitus type 2 and hypertension have emerged as key drivers of CKD burden in China. Conclusions: The CKD burden in China shows a dual pattern of rising incidence and high mortality from diabetes and hypertension-related chronic kidney disease, alongside persistently high years lived with disability from glomerulonephritis and other causes.